Heuristic Approach to Manage Semantic Heterogeneity and Data Inconsistency in Enterprise Data Integration

Authors

  • Srinivasan Shanmugam
  • Lixin Tao

Abstract

XML syntax and semantic validation are critical to correct service transaction specification and to service integration built on existing distributed, heterogeneous computing services. The current industry practice of XSLT-based Schematron validation may produce invalid results; this work contributes a reusable XML validator component that supports sound integrated syntax/semantic validation and event-driven integration with its environment through public APIs. XML is a self-describing language, and data owners do not follow a standard for XML elements and attributes: they are free to define their own tags, attributes, and nesting orders. This inconsistency makes constraint management inefficient (redundant constraints and expensive reformulations). The challenge, therefore, is whether constraints can be specified on concepts so that they apply across different XML syntactic structures, which affects the flexibility of semantic validation. Heuristics are also proposed to identify possible semantic heterogeneity and data inconsistency between XML documents that differ syntactically. In this paper, we propose a heuristic-based mechanism for managing data consistency and semantic heterogeneity that achieves data interoperability by providing more flexibility and efficiency in enterprise data integration.

Keywords: XML; OWL-Schematron; co-constraint; syntax validation; semantic validation; integrated validation; conceptual validation; semantic heterogeneity
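The idea of stating a co-constraint once, at the concept level, and applying it to documents with different tags and nesting can be illustrated with a minimal Python sketch. This is not the authors' validator or heuristic, which the abstract does not detail. The concept map, the tag and attribute names (`quantity`, `qty`, `unitPrice`, `price`, `total`), and the rule `total = quantity * unit_price` are all hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical concept map (an assumption, not taken from the paper):
# each concept lists the element paths or attribute names under which
# different data owners might store the same value.
CONCEPT_MAP = {
    "quantity": [("element", ".//quantity"), ("attribute", "qty")],
    "unit_price": [("element", ".//unitPrice"), ("attribute", "price")],
    "total": [("element", ".//total"), ("attribute", "total")],
}


def resolve(root, concept):
    """Return the first numeric value found for a concept, or None."""
    for kind, key in CONCEPT_MAP[concept]:
        if kind == "element":
            node = root.find(key)
            if node is not None and node.text:
                return float(node.text)
        else:
            # Look for the attribute on any element in the document.
            for node in root.iter():
                if key in node.attrib:
                    return float(node.attrib[key])
    return None


def check_total(xml_text):
    """Co-constraint written once over concepts: total == quantity * unit_price."""
    root = ET.fromstring(xml_text)
    values = {concept: resolve(root, concept) for concept in CONCEPT_MAP}
    if None in values.values():
        return False  # a required concept is missing from this document
    expected = values["quantity"] * values["unit_price"]
    return abs(values["total"] - expected) < 1e-9


# Same data, two syntactic structures, plus one inconsistent document.
doc_a = "<order><quantity>3</quantity><unitPrice>2.5</unitPrice><total>7.5</total></order>"
doc_b = '<order><line qty="3" price="2.5"/><total>7.5</total></order>'
doc_c = '<order><line qty="3" price="2.5"/><total>9</total></order>'

for doc in (doc_a, doc_b, doc_c):
    print(check_total(doc))
```

In this sketch the first two documents satisfy the rule despite their different structure, and the third is flagged as inconsistent. The constraint itself is never rewritten per structure; only the concept map changes when a new tag convention appears.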

Similar resources

MAXSM: A Multi-Heuristic Approach to XML Schema Matching

Transformation of business messages from one trading partner’s definition to another, or from one business message type to another is a common requirement for enterprise data integration applications. Transforming these business messages entails resolving issues of structural and semantic heterogeneity between their schemas. In this paper, we propose an automatic schema matching approach called...

Adaptive Information Analysis in Higher Education Institutes

Information integration plays an important role in academic environments since it provides a comprehensive view of education data and enables managers to analyze and evaluate the effectiveness of education processes. However, the problem in the traditional information integration is the lack of personalization due to weak information resource or unavailability of analysis functionality. In this ...

An Approach to Eliminate Semantic Heterogenity Using Ontologies in Enterprise Data Integeration

XML syntax and semantic validations are critical to the correct service transaction specification and service integration based on existing distributed and heterogeneous computing services. Current industry practice of XSLT-based Schematron validation may produce invalid results, and contributes a reusable XML validator component that supports sound integrated syntax/semantic validations and ev...

A Technique for Information System Integration

Nowadays, a central topic in database science is the need of an integrated access to large amounts of data provided by various information sources whose contents are strictly related. Often information sources have been designed independently for autonomous applications, so they may present several kinds of heterogeneity. Particularly hard to manage is the semantic heterogeneity, which is due t...

Publication date: 2014